The AI-Driven Engineering Revolution
Engineering roles have undergone an unprecedented transformation in recent years. Engineers are increasingly evolving into tech leads who manage vast fleets of AI agents, a process that feels akin to casting spells that perform complex tasks on their behalf. Within OpenAI, this shift is profound; 95% of engineers use Codex on a daily basis, and 100% of pull requests are reviewed by the tool. For managers, utilizing these AI tools has become more efficient than manual coding, with nearly all code being generated by AI first.
The Golden Age of B2B SaaS
When considering the long-term impact of AI, many have yet to fully account for the second- and third-order effects of the one-person billion-dollar startup. If organizations that lean and that valuable become possible, each may be supported by a hundred other small startups building bespoke software for it. That landscape points toward a potential golden age of B2B SaaS.
Challenges in an AI-Native Codebase
The move toward AI-centric development is not without its stresses. There is a palpable tension when agents fail to perform as expected. To better understand these challenges, a team at OpenAI is currently conducting an experiment by maintaining a codebase that is 100% written by Codex. They frequently encounter the very problems that arise when teams lose the traditional escape hatch of being able to roll up their sleeves and manually resolve complex issues.
Building for the Future of Models
Regarding product strategy in the AI space, listening strictly to customers is not always the correct approach because the field and the models themselves evolve at an extreme velocity. These models tend to disrupt their own previous iterations, effectively eating previous scaffolding for breakfast.
The primary advice for those looking to avoid missing the boat is to build for where the models are going, rather than where they are today. As Kevin Weil, VP of Science at OpenAI, frequently notes, this is the worst the models will ever be.
Engineering Workflow at OpenAI
Internal software development at OpenAI has shifted significantly toward AI-first workflows. Currently, 95% of engineers use Codex daily, and 100% of pull requests (PRs) are reviewed by the tool. Codex reviews code on its way into production, offering suggestions and requesting improvements during the PR process.
Internal metrics indicate a distinct impact on productivity: engineers who use Codex more heavily open 70% more PRs than those who use it less. This productivity gap is not static; it continues to widen over time as engineers refine their ability to collaborate with the model and increase their operational efficiency.
Shifting Roles and Trust
While there remains a transition period as some engineers adjust their trust levels regarding AI-generated output, the bar for that trust continues to rise. Frequent interactions with the model often leave engineers impressed, reinforcing a cycle where they entrust more tasks to the system as it evolves. This aligns with the perspective of Kevin Weil, the VP of Science, who frequently highlights that these models are currently at the worst they will ever be, implying that both model capabilities and human reliance on them will grow in tandem.
This shift has created a surreal professional environment, illustrated by instances such as the work of the developer behind OpenClaw. The developer has noted that when the system performs tasks, he feels confident in the results, often to the point of being willing to commit the code to the main branch directly. Furthermore, team members have described watching AI agents interact with one another as a surreal, real-life embodiment of speculative concepts.
The Future of Software Engineering
The role of the software engineer has undergone one of the most rapid transformations in recent years. Moving from a discipline where one writes every line of code to one where AI handles the generation, the profession is entering a new phase. Within the next 12 to 24 months, the industry will be defining its own standards for this new paradigm.
Engineers are increasingly functioning less as individual contributors writing syntax and more as tech leads or managers. Instead of focusing on single tasks, they are now managing fleets of agents. It is common for engineers at OpenAI to handle 10 to 20 parallel threads simultaneously. Their daily work involves steering these agents, checking in on progress, and providing iterative feedback, effectively shifting the core competency of the software engineer toward orchestration and high-level management of automated processes.
The Wizard Metaphor and the Evolution of Programming
A profound metaphor for software engineering, which remains remarkably relevant today, stems from the classic programming textbook Structure and Interpretation of Computer Programs (SICP), often referred to as the Wizard Book. First published in 1985, this text describes programming as a discipline akin to sorcery. It posits that software engineers are essentially wizards, and programming languages are the incantations they issue to command the computer to perform tasks. The central challenge of this discipline is crafting the precise incantation required to achieve the desired result.
This metaphor has persisted throughout the decades as the evolution of programming has consistently focused on making it easier to command computers. The current wave of artificial intelligence represents the latest, most literal stage of this evolution. Modern tools like Codex and Cursor enable engineers to provide exact, high-level instructions that the AI then executes, effectively turning programming into a process of issuing incantations.
The Sorcerer's Apprentice and Modern Engineering
The analogy of the Sorcerer's Apprentice from Fantasia is particularly apt for the current state of vibe coding. Like Mickey Mouse discovering the sorcerer's hat, modern engineers have access to extremely high-leverage tools that allow them to accomplish massive tasks with great ease. However, this power necessitates deep expertise; the plot of the story serves as a warning, as Mickey’s attempt to automate his chores with the enchanted brooms leads to chaos and flooding when he loses control.
In the context of modern development, a highly proficient senior engineer might manage 10 to 20 parallel threads of Codex jobs simultaneously. While this provides incredible leverage, it requires significant seniority and careful oversight to ensure the models do not go off the rails. The role has fundamentally shifted from writing individual lines of code to managing fleets of agents. The engineer must actively steer these processes, providing feedback and validation to prevent unintended side effects—often described as a monkey's paw scenario, where you receive exactly what you requested, but perhaps not with the consequences you anticipated.
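The supervision pattern described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `run_agent` is a hypothetical stub standing in for whatever actually dispatches work to a coding agent, and the rule it uses to flag a failed thread (an empty task description) is invented for the example. The point is the shape of the workflow: fan out many threads, cap the concurrency, then surface only the threads that need a human to steer them.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AgentResult:
    task: str
    ok: bool
    summary: str


async def run_agent(task: str) -> AgentResult:
    """Hypothetical stand-in for dispatching one task to a coding agent."""
    await asyncio.sleep(0)  # placeholder for real, long-running agent work
    # Stub rule for illustration only: a blank task counts as underspecified.
    ok = bool(task.strip())
    return AgentResult(task, ok, "done" if ok else "needs clearer requirements")


async def supervise(tasks: list[str], max_parallel: int = 20) -> list[AgentResult]:
    """Run tasks concurrently, capped at max_parallel; results keep input order."""
    gate = asyncio.Semaphore(max_parallel)

    async def guarded(task: str) -> AgentResult:
        async with gate:
            return await run_agent(task)

    return await asyncio.gather(*(guarded(t) for t in tasks))


def needs_attention(results: list[AgentResult]) -> list[AgentResult]:
    """Return the threads a human should steer or re-specify."""
    return [r for r in results if not r.ok]
```

Calling `asyncio.run(supervise([...]))` and passing the result to `needs_attention` gives the engineer a short list to check in on, rather than a requirement to watch every thread.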
Ultimately, this transition toward an AI-augmented workflow is transforming software development into an experience that feels truly magical. Engineers are now closer than ever to a reality where they can simply cast spells and command software to manifest complex functionalities, provided they maintain the vigilance and skill to guide their digital apprentices effectively.
Navigating Agent Frustration and Context Challenges
We have certainly reached a point where software engineering feels like casting spells, and it is fascinating to see how the metaphor of the Wizard Book persists. As these tools become more central to our work, many developers are experiencing stress when their agents fail to perform as expected. Firing off various agents only to find one stalling or failing can feel like a waste of time.
This tension is actually where the most interesting progress is happening. Because these models and tools are not yet perfect, we are in a phase of figuring out the best paradigms for interaction. Internally at OpenAI, we have a team conducting an experiment where they maintain a codebase written entirely by Codex. Unlike typical workflows where a developer might rely on an escape hatch—such as rolling up their sleeves to fix the code manually or using simple tab-completion—this team is committed to the agent. When they cannot get the agent to build a feature, they are forced to confront the limitation head-on.
We have found that in the vast majority of cases, when a coding agent is not performing as intended, the issue is an underspecification of requirements or a lack of sufficient context. The information simply is not available to the model. The solution is not to bypass the agent, but to improve the documentation and encode the tribal knowledge currently trapped in the developer's head into the codebase itself. This can be done through code comments, explicit code structures, or additional resource files like Markdown documentation and skill sets within the repository. Removing the escape hatch has forced this team to piece together the best practices required to truly lean into agent-based development.
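One low-tech way to act on this is to audit where written context is missing. The sketch below is an assumption-laden illustration, not the team's actual tooling: it treats a `README.md` in each top-level directory as the "context file" (the file name is a placeholder and is configurable) and lists the directories that lack one.

```python
from pathlib import Path


def dirs_missing_context(root: str, context_name: str = "README.md") -> list[str]:
    """List immediate subdirectories of root that lack a context file.

    context_name is a hypothetical convention; use whatever file the
    repository's agents are actually configured to read.
    """
    base = Path(root)
    missing = []
    for child in sorted(base.iterdir()):
        # Skip hidden directories such as .git.
        if child.is_dir() and not child.name.startswith("."):
            if not (child / context_name).is_file():
                missing.append(child.name)
    return missing
```

A report like this does not write the documentation, but it points at the parts of the codebase where knowledge most likely still lives only in someone's head.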
Scaling Code Review in an Agent-Driven Workflow
As developers use AI to ship pull requests at a much higher frequency, the bottleneck of manual code review becomes a critical challenge. We are actively working to ensure this does not create a tedious, unsustainable workload.
At this point, Codex reviews 100% of our internal pull requests. The most powerful realization is that we tend to offload the tasks that annoy us the most—the most boring parts of software engineering—to the models. This has made the profession more enjoyable because engineers can focus on the creative aspects of their work.
Personally, I historically disliked code reviews. In my first job out of college at Quora, I owned the code for the newsfeed, which was a central component touched by almost everyone. Every morning, I would log in to face twenty to thirty code reviews, which felt like a massive drain on productivity. By shifting that burden to AI agents, we can scale our output while freeing engineers from the most repetitive aspects of the job.
Automated Code Review and CI Processes
Codex has proven highly adept at reviewing code, particularly when steered correctly. By integrating Codex into the workflow, code review has shifted from a time-consuming 10 to 15-minute task into a quick 2 to 3-minute process, as numerous suggestions are already baked in. For many smaller pull requests, the author can trust the model's review, effectively using Codex as a smart second pair of eyes to prevent basic errors. While attention to pull requests has not dropped to zero, it has shifted from 100 percent human focus to approximately 30 percent, significantly accelerating the deployment pipeline.
This automation extends to the broader CI and post-push deployment processes. Engineering teams often find the transition from writing code to getting it into production—managing tests, linting, and reviews—to be the most tedious part of the job. By building internal tools to automate these steps, such as using Codex to automatically patch linting errors and restart CI, the team has minimized the manual workload. Consequently, engineers are able to merge and push significantly more pull requests. Regarding the use of multiple models, the team frequently tests various internal model variants to gain different perspectives, choosing to prioritize dogfooding their own technology rather than relying heavily on external models.
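The "patch linting errors and restart CI" loop can be sketched generically. The internal tools themselves are not described in detail, so everything here is an assumption: `run_checks` and `request_fix` are hypothetical callables supplied by the caller (for example, a wrapper around a linter and a wrapper that hands the errors to a coding agent), and the round limit is an arbitrary safeguard against looping forever.

```python
from typing import Callable


def fix_until_green(
    run_checks: Callable[[], list[str]],
    request_fix: Callable[[list[str]], None],
    max_rounds: int = 3,
) -> bool:
    """Re-run checks after each automated fix; return True once they pass."""
    for _ in range(max_rounds):
        errors = run_checks()
        if not errors:
            return True
        request_fix(errors)  # hypothetical hook: hand the errors to a coding agent
    # One final check after the last fix attempt.
    return not run_checks()
```

Returning `False` when the round limit is reached is the deliberate design choice: it is the signal to pull a human back in rather than let the automation keep retrying.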
AI Attribution in Production
While it is difficult to provide an exact percentage regarding AI attribution for code currently running in production, almost every engineer at OpenAI heavily utilizes Codex across their tasks. The vast majority of code authored at this point can likely be attributed to the model.
The Evolving Role of Engineering Management
The role of an engineering manager has changed less, so far, than that of an individual contributor, as there is no specific "Codex for managers" yet. However, managers are beginning to see shifts in their day-to-day operations. They are already finding ways to leverage AI for administrative and management-heavy tasks, and the current trajectory gives a reasonably clear picture of where AI-assisted management is heading.
Managing in an AI-Driven Environment
While the daily responsibilities of an engineering manager have not undergone as drastic a transformation as those of individual contributors, significant trends are beginning to emerge. Managers are finding that AI tools, such as Codex, provide meaningful leverage for management tasks, even if there is no direct equivalent to code generation for managerial functions just yet.
Empowering High Performers
A central observation is that AI tools disproportionately empower top performers, allowing them to achieve significantly higher levels of productivity. Because these individuals tend to possess high agency and lean into new technologies, they effectively supercharge their own output. This has led to a wider performance spread across teams.
As a result, a core management philosophy of dedicating the majority of time to top performers—ensuring they remain unblocked, satisfied, and heard—is becoming even more critical. Teams that are fully embracing AI-generated codebases have shown that giving high performers the autonomy to experiment with these tools pays substantial dividends. Supporting these individuals as they define and share new best practices for AI interaction serves to elevate the entire organization.
Increasing Managerial Leverage
Managers are also finding ways to utilize AI for their own operational tasks. By integrating AI models like ChatGPT with organizational knowledge—such as access to GitHub repositories, Notion documentation, and internal reports—managers can conduct research and synthesize context far more efficiently. A prime example is the performance review process; AI can aggregate a person's contributions over the past year and generate comprehensive summaries, which significantly reduces the administrative burden of these evaluations.
This increased leverage suggests that the future of management may involve overseeing much larger teams than current best practices allow. While the standard span of control for software engineering is typically six to eight direct reports, AI-augmented management could reasonably support significantly larger groups. This shift is already being mirrored in non-engineering domains like operations and support, where agents are handling more complex workflows and enabling managers to oversee broader sets of activities.
Ultimately, the consensus remains that while AI makes good employees better, it makes great people truly exceptional. By leaning into these tools, managers can focus their energy on high-leverage activities, enabling their teams to operate with greater clarity and efficiency.
The Rise of the One-Person Billion-Dollar Startup
One of the most compelling narratives to emerge from the current artificial intelligence wave is the concept of the one-person billion-dollar startup. While the idea that a single individual can wield enough agency and leverage through AI tools to build a business of that magnitude is fascinating, many observers are failing to price in the broader second and third-order effects of this shift.
A Potential Startup Boom
The primary implication of the one-person billion-dollar company is that it reflects a massive increase in individual leverage. If it is possible for one person to reach that scale, it becomes exponentially easier for people to launch startups in general. This suggests we may be heading toward a significant explosion in startup and small-to-medium business creation, as the barriers to building custom software for any niche or domain continue to collapse.
We are already observing this trend in the AI startup ecosystem, where software is becoming increasingly vertical-oriented. By leaning into a specific domain and deeply understanding its unique use cases, these focused AI tools have proven effective. If this pattern scales, there is no reason we could not see a hundredfold increase in the number of these specialized startups.
A Golden Age for B2B SaaS
It is entirely possible that to support these one-person billion-dollar entities, a new ecosystem of a hundred other small startups will emerge to build the bespoke, highly effective software required to support them. We may well be entering a golden age for B2B software as a service and startups more broadly. As building software and running a company becomes progressively easier, the sheer volume of new businesses will likely rise.
While the headline-grabbing goal is the one-person billion-dollar startup, the more realistic landscape likely includes a hundred startups valued at 100 million dollars and tens of thousands of businesses valued at 10 million dollars. For the individual founder, building a 10-million-dollar business is a life-changing outcome.
Shifts in the Venture Ecosystem
The third-order effects introduce more uncertainty. If we transition to a world dominated by micro-companies building software for one or two people who also act as the owners and operators, the traditional startup and venture capital ecosystems will be forced to adapt.
We might end up in a world where only a handful of major players remain to offer the platforms that support this vast array of smaller startups. Furthermore, the number of venture-scale companies capable of providing 100x or 1000x returns might actually shrink if the market becomes saturated with smaller, 10-to-50-million-dollar businesses. While these smaller entities may not align with traditional venture capital return profiles, they represent a massive win for the high-agency individuals who are using AI to build sustainable, independent businesses for themselves.
Scaling Challenges for Billion-Dollar Startups
The notion of a billion-dollar, one-person startup faces a significant hurdle: support volume. Even with AI assistance, handling support requests remains difficult to scale. Unless a company maintains a very high average contract value (ACV) with a limited number of customers, dealing with the volume of inquiries—often regarding trivial issues—is a bottleneck. It is highly challenging to operate as a true one-person company at that scale without relying on contractors, which raises the question of whether that still qualifies as a single-person entity.
The Rise of Niche Support Ecosystems
A different perspective on this challenge is that the market may evolve to support these solo founders through a proliferation of specialized software. Rather than a founder personally managing an AI agent to triage support, there may emerge a wave of startups building software tailored specifically for needs like podcasts and newsletters. A solo founder could purchase these highly targeted tools—potentially built by other one-person startups—to automate and manage these operational complexities.
Because the cost of building software and products is collapsing, founders may increasingly outsource these functions, effectively reducing the internal size of their companies. While the outcome remains uncertain, it is plausible that a single individual could manage a massive, highly leveraged company that reaches a billion-dollar valuation by integrating these specialized, outsourced solutions.
The Value of Distribution
The increasing density of new products creates intense competition for user attention. Consequently, distribution is becoming a critical fourth-order effect. Individuals who possess an established audience or platform will become increasingly valuable in this environment.
Management Lessons for Scaling
Regarding management, the strategy of dedicating more time to top performers remains a highly effective principle. While management philosophies naturally evolve, certain core tenets remain consistent even when leading teams building foundational infrastructure—such as an API that powers a significant portion of the AI economy. Success in managing engineers and high-agency individuals often relies on these fundamental, time-tested practices rather than constantly shifting styles.
Empowering Engineers as Surgeons
My management philosophy has evolved over time, but one core principle remains constant: dedicate over 50% of your time to your top 10% of performers. The primary objective is to empower these individuals by ensuring they have everything they need to execute their work effectively.
I find the "surgeon" analogy from The Mythical Man-Month, Fred Brooks's book published in 1975, to be an incredibly powerful framework. Brooks envisioned software engineering mirroring the environment of an operating room. In this model, there is one person, the surgeon, performing the core work, while everyone else in the room acts as support staff. Whether it is a nurse, resident, or fellow, their sole purpose is to provide the surgeon with the exact tools, machines, or assistance required at any given moment.
While modern software development is inherently collaborative rather than a single-person endeavor, I strive to emulate this "surgeon support" model in my management style. I want every person on my team to feel like a surgeon, supported by an army of people who are looking around corners to provide them with the resources they need before they even have to ask.
Looking Around Corners
One of the most effective ways a manager can provide value is by proactively identifying and removing blockers, especially from an organizational or process-oriented perspective. In today's AI-driven landscape, this is even more critical. If engineers are focused on shipping code and cranking out pull request after pull request, the primary bottlenecks to progress are often organizational.
If an engineer needs a scalpel, the best-case scenario is that the manager has already anticipated that need and has the tool ready. This ability to look around corners and unblock the team is the cornerstone of my approach to engineering management.
AI-Assisted Management
There is significant potential for AI to assist in this "looking around corners" process. While I have not yet implemented this, it is fascinating to consider hooking an AI model, such as ChatGPT, into company knowledge sources—like Notion documentation or Slack conversations—to analyze team activity.
One could potentially ask the AI to identify active blockers across the team or, even more powerfully, to anticipate future challenges. By asking the AI to perform second- and third-order analysis, a manager could potentially predict what will block a specific engineer or team in the coming months. This proactive, AI-augmented foresight could be the next step in evolving how we support and unblock our engineers.
Challenges in AI Deployment and Achieving Positive ROI
The discussion surrounding the effectiveness of AI deployments often leads to the question of return on investment. While precise quantitative data is difficult to track, there is a strong suspicion that many current AI deployments are yielding negative ROI. This issue is partly reflected in a broader sentiment from individuals outside the tech industry who feel that AI is being forced upon them.
The Silicon Valley Bubble
A primary driver of these ineffective deployments is a lack of awareness regarding how AI is truly perceived and utilized outside of specific hubs. It is easy for those within the technology sector, particularly in Silicon Valley or on platforms like X, to forget that they operate within a bubble. Most people in the United States and globally are not software engineers, are not deeply engaged with every new model release, and are not well-versed in best practices for utilizing AI technology.
There is a significant disconnect between the power users who lean into advanced skills, agents, and MCP (Model Context Protocol) servers, and the average employee tasked with using AI tools. In many companies, the implementation of AI is limited to very simple tasks because the staff lacks a foundational understanding of how the underlying technology functions. Consequently, these organizations are not yet pushing the boundaries of what is possible.
Achieving an Ideal AI Deployment
For an AI deployment to be successful, it requires a balanced approach that combines top-down support with bottom-up adoption. This hybrid model has proven effective within organizations like OpenAI.
- Top-down buy-in: Leadership must commit to the initiative, signaling an organizational shift toward becoming an AI-first company. This ensures that the necessary tools are procured and that there is active executive support for the transition.
- Bottom-up adoption: Equally critical is the engagement of employees who are actively performing the work. Success relies on staff members who are genuinely excited about the technology, willing to invest time in learning, and capable of evangelizing its use.
The most successful companies foster an environment where these employees can develop best practices and share their knowledge across the organization. Internal adoption at OpenAI truly accelerated with the introduction of tools like Codex, which enabled employees to start building and experimenting with the technology themselves rather than relying solely on external mandates.
Internalizing AI Deployment
The most effective approach to becoming an AI-first organization requires a synergy between top-down buy-in and bottom-up adoption. While executive support and the acquisition of tools are essential, success ultimately relies on employees performing the work who are genuinely enthusiastic about the technology, willing to learn, and capable of evangelizing its benefits.
This bottom-up energy is crucial because work is inherently unique across different departments, such as finance, operations, sales, and software engineering. Each function possesses specific, last-mile intricacies that require a granular, bottom-up effort to address.
The Dangers of Top-Down Mandates
When AI deployment is exclusively a top-down mandate, it often becomes disconnected from the reality of daily operations. Organizations may find themselves with a large workforce that does not understand the technology. Although employees might face pressure to use AI—perhaps due to expectations in performance reviews—they often lack the knowledge or guidance to apply it effectively. In the absence of internal experts or peers to learn from, the adoption often stalls.
Building an Internal Tiger Team
Companies pushing for AI integration should identify or staff a full-time, internal tiger team. This group is tasked with:
- Exploring the full extent of model capabilities.
- Applying the technology to specific, practical workflows.
- Facilitating knowledge sharing.
- Generating excitement among the broader workforce.
Without such a dedicated team, it is often extremely difficult for the average employee to pick up these tools successfully.
Staffing the Evangelist Team
Interestingly, these teams are often not led by software engineers. While engineers understand the technology, they are frequently in short supply. Instead, the most successful internal champions are often technical-adjacent personnel—individuals such as operations leads or support staff who may not write code but possess high technical aptitude, like an Excel wizard. These are the individuals who are most likely to get excited about the potential of AI tools.
Management should look for high performers who are naturally gravitating toward AI adoption and empower them to lead the charge. This can be accomplished through activities such as hosting hackathons, organizing seminars, and leading knowledge-sharing sessions to cultivate the seeds of internal innovation.
The AI Customer Feedback Trap
A common question regarding strategy involves the role of customer feedback. While it is generally useful to listen to customers, there is a nuance in the AI field that can be misleading. Because the models and the underlying technology have evolved so rapidly over the last three years—frequently disrupting themselves, particularly in the realms of tooling and scaffolding—relying solely on traditional customer feedback can sometimes lead an organization astray.
The Scaffolding Trap
A core challenge in the current artificial intelligence landscape is that the models themselves evolve with such rapid velocity that they frequently disrupt their own support structures. This phenomenon is often summarized by the phrase that models will eat your scaffolding for breakfast.
Looking back at the launch of ChatGPT in 2022, the models were comparatively raw. This necessitated a massive influx of developer-focused product scaffolding—tools, agent frameworks, and vector stores—designed to steer the model and force it to perform specific tasks. At the time, embedding entire corporate datasets into vector stores and engineering complex search optimizations seemed like the definitive way to provide organizational context.
However, as the underlying models grew increasingly capable, they effectively rendered much of that intricate scaffolding unnecessary. A superior approach often involves stripping away that complex logic and instead trusting the model to handle information access through simpler tools. For instance, rather than relying exclusively on a vector store, a model can be connected to more standard file systems or straightforward search utilities.
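To make "straightforward search utilities" concrete, here is a minimal sketch of the kind of tool a model could be given in place of an embedding pipeline: a plain substring search over files on disk. The function name, parameters, and return shape are illustrative assumptions, and wiring it up as a tool for a particular model API is left out.

```python
from pathlib import Path


def search_files(root: str, query: str, max_hits: int = 20) -> list[tuple[str, int, str]]:
    """Case-insensitive substring search over text files under root.

    Returns (relative_path, line_number, line_text) tuples, capped at max_hits.
    """
    needle = query.lower()
    hits: list[tuple[str, int, str]] = []
    base = Path(root)
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.relative_to(base).as_posix(), number, line.strip()))
                if len(hits) >= max_hits:
                    return hits
    return hits
```

There is no index, no chunking, and no embedding step. The bet is that a capable model can issue several such queries and read the hits itself, which is the simplification the paragraph above describes.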
The Danger of Local Maxima
This rapid evolution creates a nuanced problem when interacting with customers. If you listen exclusively to customer feedback, you risk being led into a local maximum. A customer might fervently request a more advanced agent framework or a more sophisticated vector store integration. If you blindly follow that request, you end up building features that are optimized for a paradigm that may soon be obsolete.
Because the field is a moving target, building in this space requires a delicate balancing act. You must reconcile the concrete desires of your customers with an informed perspective on where model capabilities are trending over the next one to two years.
The Applied Bitter Lesson
There is essentially a version of the bitter lesson (the observation in machine learning that general methods which scale with compute eventually outperform hand-crafted, complex architectures) applied specifically to building with AI. We often attempt to architect elaborate systems around these models, only to find that the models eventually absorb that functionality, rendering the extra architecture redundant. Even the OpenAI API team has been guilty of this cycle, as the constant advancement of the technology continually demands a re-evaluation of which abstractions and tools are truly necessary.
Designing for Future Model Capabilities
When developing products with current artificial intelligence, it is crucial to build for where the models are going, not just where they are today. A common trait among successful startups is designing for an ideal level of capability that may currently be only 80 percent achievable. While the product might function with some limitations in the present, these gaps often disappear as models improve, suddenly unlocking the full experience without requiring fundamental architectural changes. This forward-looking approach creates a superior user experience compared to designs that assume the underlying technology will remain static.
The Evolution of Task Duration
A key trend to watch over the next 12 to 18 months is the dramatic increase in the duration over which models can perform tasks coherently. Current software engineering benchmarks indicate that frontier models can already complete multi-hour tasks roughly 50 percent of the time, with tasks lasting just under an hour achieving an 80 percent success rate.
Most products today are heavily optimized for tasks that last only a few minutes, essentially serving as interactive assistants. However, as model coherence scales, we will see the emergence of agents capable of handling multi-hour or even day-long tasks independently. Once models can reliably manage work over a six-hour timeframe, the paradigm of product design will shift from simple interaction to delegation. This transition necessitates new interfaces where users can effectively supervise and provide feedback to agents without needing to monitor every minor step.
Advancements in Multimodality
Beyond raw task duration, significant improvements are expected in multimodal models, with a particular emphasis on audio. Models already handle audio quite well, but native speech-to-speech models are poised to become significantly more capable over the coming 6 to 12 months. Interesting developments are also emerging around new types of models and architectures built specifically for multimodal audio.
Audio remains a hugely underrated domain, especially in enterprise and business settings. While the technology discourse is heavily focused on coding and text, a vast portion of global business, services, and operations is conducted through audio and verbal communication. This area is expected to be very exciting over the coming 18 months, with substantial new opportunities for what can be achieved using audio models.
In summary, AI tools and agents will keep running coherently for longer and longer durations, and audio and speech will become more native, first-party, and central to the core user experience.
The Opportunity in Business Process Automation
A major area of focus—and a significant opportunity for AI impact—is business process automation. A common tendency in Silicon Valley is to view the world through a bubble where work is primarily defined by software engineering, product management, and building products. These activities are characterized by open-ended knowledge work, which is not highly repeatable.
However, the broader economy is structured very differently. Outside of core tech companies, a substantial amount of work consists of business processes. These are repeatable operations, often governed by standard operating procedures that organizations aim to follow with high consistency. Unlike software engineering, where the value is often found in novelty or unique solutions, the goal in these business operations is to avoid unnecessary deviation.
This category of work includes everything from support lines to interactions with utility companies, where there is a rigid set of processes and defined actions. The general category of business process automation is currently underrated because it falls outside the typical wheelhouse of tech-centric thinking.
There is significant potential to apply AI tools and frameworks toward making these repeatable, highly deterministic processes easier to manage. The key is to create systems that are fully integrated with business data and decision-making structures within an enterprise. While software engineering is part of the transformation story, the impact on the business process side may ultimately look even more transformative. Whether the total opportunity on the business process side is larger or smaller than that of software engineering remains to be seen, but the work to be done in this domain is undoubtedly substantial.
Business Process Automation and Startup Strategy
Beyond the obvious impact on software engineering, artificial intelligence is poised to transform business processes. While the scope of this shift is difficult to quantify precisely, it is undeniably massive—likely rivaling or exceeding the impact on coding itself. Much of the world's work is currently structured in a way that is highly susceptible to AI-driven automation, and this area represents a substantial opportunity for AI to change how work is done within large enterprises.
Navigating Competition with OpenAI
A common concern for developers and founders is the fear that OpenAI will release its own product, squashing their ideas and destroying their market. However, the market for AI applications is vast, and the success or failure of startups is rarely determined by competition from large labs like OpenAI or Google. Instead, startups that fail usually do so because their product failed to resonate with customers. Conversely, companies that build products people truly love can succeed even in highly competitive spaces.
The current opportunity space for building with AI is unprecedented. This shift is evident in the behavior of venture capitalists, who are now aggressively investing in competing companies across the same sectors. This environment is incredibly empowering for startups. Even if you only build something that a specific group of people truly loves, the potential to create a massively valuable business is significant.
Commitment to an Ecosystem Platform
From the beginning, OpenAI has viewed itself as an ecosystem platform company, with the API serving as its first product. This is not just a tactical decision but a core part of the organization's charter and mission. OpenAI is committed to fostering this ecosystem rather than competing with or squashing the developers who build upon it.
This philosophy is reflected in the company's operational decisions:
- Neutral Platform: OpenAI works to keep its platform neutral and does not block competitors from accessing its models.
- Release Parity: Every model released in an OpenAI product is also released via the API. Even models optimized for specific tasks, such as those within the Codex harness, are made available to all customers.
- Ecosystem Growth: Features like Sign in with ChatGPT are examples of how the company intends to support developers and foster a broader ecosystem.
The guiding principle is that a rising tide lifts all boats. While OpenAI has grown into a large organization, it remains committed to raising the tide for everyone involved. The API has grown significantly by adhering to this approach, and founders are encouraged to focus on building value rather than worrying about potential displacement by the platform provider. This commitment to an open ecosystem remains a central pillar of the organization's vision for achieving its long-term mission of building AGI.
The Platform Vision
The commitment to building a platform is not a recent pivot; it has been a core part of the OpenAI vision from the very beginning. This approach is directly rooted in the company's charter and mission. While the primary mission is to build AGI, the second, equally vital component is to spread the benefits of that technology to all of humanity.
Recognizing that OpenAI as a single company could not possibly reach every corner of the globe or address every specific need, the team launched the API as early as 2020. The goal was to empower others to build specialized tools, such as customer support bots for podcasters or resources for newsletter hosts, that the OpenAI team could not feasibly build itself. This philosophy remains a fundamental expression of the company's mission, and OpenAI continues to prioritize fostering an open ecosystem where a diversity of businesses can thrive.
Scaling Through ChatGPT
The ChatGPT app store represents another extension of this platform strategy. Although it is managed by a different team, within the ChatGPT organization, that team collaborates closely with the API team, including on the development of an apps SDK. With roughly 800 million weekly active users, ChatGPT has become an immense asset. Allowing external companies to build for this vast audience provides value to developers and also expands the utility and reach of the service itself.
The scale of 800 million weekly active users is, by any measure, unprecedented. The leadership views this as representing roughly 10% of the world population, and the number is continuing to grow rapidly. From a scaling perspective, the growth is described as mind-boggling, reflecting a global shift in how people interact with technology.
Democratizing Intelligence
A central critique often leveled at the organization involves the costs associated with its services. However, the company emphasizes the existence of a free version of ChatGPT that is accessible to anyone, regardless of their location or economic status. The models behind this free offering are not significantly different from the most powerful AI models currently available. Whether one is a billionaire or a person in a remote village, the core intelligence available to them remains remarkably consistent.
This strategy of raising the floor for everyone is intentional. The free model has improved massively since 2022, and today's iteration is vastly more capable than its predecessors. By offering essentially the same advanced models that the wealthiest individuals use, either for a nominal monthly fee or entirely for free, OpenAI aims to ensure that powerful technology is not gated by wealth, fulfilling its commitment to distribute the benefits of AI broadly across humanity.
API Platform and Development Tools
The company prioritizes the democratization of artificial intelligence, aiming to spread these benefits globally as part of its core mission. For developers interested in building applications, the platform offers a comprehensive stack designed to support various levels of abstraction, depending on how much guidance or control is needed.
Core Development Primitives
At the lowest level, the Responses API serves as the most popular developer endpoint. This endpoint is specifically optimized for building long-running agents. It functions as a foundational primitive where developers provide the model with text, poll it to monitor progress, and eventually receive the model response. Because it is highly unopinionated, this tool allows for maximum flexibility, enabling developers to build almost anything they desire.
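The submit-then-poll pattern described above can be sketched in a few lines. This is a minimal illustration and does not call the real endpoint or its client library: `FakeBackend`, the status strings, and the dictionary shape are all assumptions made up for the example, standing in for whatever a real client would return.

```python
import time
from typing import Callable

# Status strings treated as final in this sketch (an assumption for illustration).
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}


def poll_until_done(
    fetch: Callable[[], dict],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch` repeatedly until the job reports a terminal status."""
    waited = 0.0
    while True:
        job = fetch()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if waited >= timeout_s:
            raise TimeoutError(f"job still {job['status']!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


class FakeBackend:
    """Hypothetical in-memory stand-in for a remote long-running job."""

    def __init__(self, steps_until_done: int, output_text: str) -> None:
        self._remaining = steps_until_done
        self._output_text = output_text

    def retrieve(self) -> dict:
        # Report "in_progress" a fixed number of times, then "completed".
        if self._remaining > 0:
            self._remaining -= 1
            return {"status": "in_progress", "output_text": None}
        return {"status": "completed", "output_text": self._output_text}


def run_demo() -> dict:
    backend = FakeBackend(steps_until_done=3, output_text="done: summary written")
    # sleep is swapped for a no-op so the demo finishes instantly.
    return poll_until_done(backend.retrieve, interval_s=0.5, sleep=lambda _s: None)


if __name__ == "__main__":
    print(run_demo())
```

Passing `fetch` and `sleep` in as parameters keeps the loop independent of any particular client, so the same helper could wrap a real retrieval call once one is wired in.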
Agents SDK and Orchestration
For those seeking more structure, the Agents SDK provides a higher layer of abstraction. This tool has become extremely popular for building complex agents, which are entities that operate in what is essentially an infinite loop. The SDK includes the scaffolding needed to implement guardrails, let an agent delegate subtasks to other agents, and orchestrate an entire swarm of them.
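To make the loop, guardrail, and delegation ideas concrete, here is a minimal sketch in plain Python. It is not the Agents SDK and does not use its API: `Agent`, `run`, `input_guardrail`, the action dictionaries, and the two toy policies are all hypothetical names invented for illustration, with a simple function standing in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# A "policy" stands in for the model: given the transcript so far, it returns
# an action. Action shapes used in this sketch (all hypothetical):
#   {"type": "final", "text": ...}
#   {"type": "delegate", "to": <helper name>, "task": ...}
Policy = Callable[[list[str]], dict]


@dataclass
class Agent:
    name: str
    policy: Policy
    helpers: dict[str, "Agent"] = field(default_factory=dict)
    max_turns: int = 8  # hard cap so the loop cannot run forever


def input_guardrail(task: str) -> Optional[str]:
    """Return a refusal reason if the task should be blocked, else None."""
    lowered = task.lower()
    for phrase in ("delete production", "drop table"):
        if phrase in lowered:
            return f"blocked by guardrail: {phrase!r}"
    return None


def run(agent: Agent, task: str) -> str:
    reason = input_guardrail(task)
    if reason is not None:
        return reason
    transcript = [f"task: {task}"]
    for _ in range(agent.max_turns):
        action = agent.policy(transcript)
        if action["type"] == "final":
            return action["text"]
        if action["type"] == "delegate":
            # Hand the subtask to a helper agent and record what it returned.
            helper = agent.helpers[action["to"]]
            result = run(helper, action["task"])
            transcript.append(f"{helper.name} returned: {result}")
            continue
        raise ValueError(f"unknown action type: {action['type']!r}")
    return "stopped: turn limit reached"


def researcher_policy(transcript: list[str]) -> dict:
    return {"type": "final", "text": "3 open bugs found"}


def lead_policy(transcript: list[str]) -> dict:
    # First turn: delegate. Later turns: answer using the helper's result.
    if len(transcript) == 1:
        return {"type": "delegate", "to": "researcher", "task": "count open bugs"}
    return {"type": "final", "text": f"report based on: {transcript[-1]}"}


researcher = Agent("researcher", researcher_policy)
lead = Agent("lead", lead_policy, helpers={"researcher": researcher})

if __name__ == "__main__":
    print(run(lead, "summarize bug backlog"))
```

The turn cap and the guardrail check are the two places where the sketch keeps an open-ended loop under control; a real framework would typically offer richer versions of both.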
Deployment and Evaluation
The platform also provides meta-level tools to assist with the deployment and validation of these applications:
- AgentKit and Widgets: This suite includes various user interface components, allowing developers to quickly build beautiful, consistent interfaces on top of the API or the Agents SDK.
- Eval API: This product provides a quantitative way to test workflows, models, and agents, ensuring that developers can verify their systems are functioning as intended.
Developers can utilize the entire stack to build agents rapidly or move down the stack to the Responses API for lower-level, bespoke implementations.
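As a rough illustration of what testing a workflow quantitatively can look like, the sketch below runs a system over a fixed set of cases and reports a pass rate. It is a generic harness and not the platform's evaluation product: `EvalCase`, `run_eval`, `exact_match`, and the `toy_router` system under test are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> bool:
    """Grader: case-insensitive comparison after trimming whitespace."""
    return output.strip().lower() == expected.strip().lower()


def run_eval(
    system: Callable[[str], str],
    cases: Sequence[EvalCase],
    grader: Callable[[str, str], bool] = exact_match,
) -> dict:
    """Run every case through `system` and summarize passes and failures."""
    failures = []
    for case in cases:
        output = system(case.prompt)
        if not grader(output, case.expected):
            failures.append((case.prompt, output))
    total = len(cases)
    passed = total - len(failures)
    return {
        "total": total,
        "passed": passed,
        "pass_rate": passed / total if total else 0.0,
        "failures": failures,
    }


def toy_router(prompt: str) -> str:
    """Hypothetical system under test: routes a support message to a queue."""
    text = prompt.lower()
    if "refund" in text:
        return "billing"
    if "password" in text:
        return "account"
    return "general"


CASES = [
    EvalCase("I want a refund for last month", "billing"),
    EvalCase("I forgot my password", "account"),
    EvalCase("My invoice is wrong", "billing"),  # the toy router misses this one
    EvalCase("What are your opening hours?", "general"),
]

if __name__ == "__main__":
    report = run_eval(toy_router, CASES)
    print(f"{report['passed']}/{report['total']} passed")
    for prompt, output in report["failures"]:
        print(f"FAIL: {prompt!r} -> {output!r}")
```

Keeping the grader as a parameter means the same harness could score an agent or a model call with a looser check than exact match, while the case list stays fixed so that changes can be compared run to run.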
Looking Ahead
The next two to three years are expected to be an exceptionally energetic and fun period for both the tech industry and the startup world. The current wave of innovation will eventually give way to a more incremental phase, so the encouragement is to make the most of this time. Compared with the more stagnant periods that came before, the current environment is viewed as the most exciting of the past decade, and there is a strong call to invent, explore, and not take this period of rapid advancement for granted.
Overcoming Anxiety and Staying Engaged
When discussing how to navigate the overwhelming pace of the current tech landscape without feeling constant anxiety, the advice is simple: don't try to absorb everything. The combination of a frenetic industry and social platforms creates a constant stream of news that can be incredibly overwhelming, but most of it is just noise.
You do not need to follow every update to engage with the technology meaningfully. Instead, start small and lean into one or two specific tools. Installing a client and playing around with it, or connecting a tool to a few internal data sources such as Notion, Slack, or GitHub to see what it can and cannot do, is more than sufficient. The core goal is to get familiar with the technology and understand its capabilities and limitations so that you remain an active participant rather than letting the wave pass you by.
Lightning Round: Book Recommendations
When asked for book recommendations, the following titles stand out as particularly fascinating and insightful.
Fiction
There Is No Antimemetics Division by qntm is a highly recommended science fiction work. It tells the story of a government agency tasked with fighting entities that cause people to forget them. It is a smart, creative, and fresh piece of writing that is also, perhaps unintentionally, hilarious at times despite its horror-leaning themes. It is a gripping read that can be devoured in a very short time.
Non-Fiction
In the realm of non-fiction, there are two recent books that have been eye-opening regarding US-China relations:
- Breakneck by Dan Wang: This book is praised for its clear-eyed analysis of the societal differences between the two nations. It introduces a compelling analogy, framing the United States as a lawyerly society while describing China as an engineering society, each with its own set of distinct pros and cons.
- The second recommendation in this category continues to explore these geopolitical and systemic dynamics, offering further perspective on the ongoing shifts between the two powers.
Technology and Media Recommendations
Beyond books, finding time for other media is challenging given professional and personal responsibilities. However, anime remains a preferred choice for its unique and novel plots that Western media often avoids. Recently, watching the third season of Jujutsu Kaisen was a highlight.
When it comes to home networking technology, Ubiquiti is a standout recommendation. It is comparable to an Apple of home networking: well-built hardware paired with excellent software, including a highly effective mobile app. These systems require Ethernet wiring throughout a home, but the ecosystem, particularly the security cameras, provides an incredible user experience. Managing live feeds through dedicated apps for mobile devices, iPad, or Apple TV is exceptionally smooth, making it a worthwhile investment.
Life Philosophy and Real Estate Insights
In both professional and personal challenges, a core principle is to never feel sorry for oneself. Maintaining a sense of agency to overcome obstacles is a crucial mindset to cultivate and share with others.
Drawing from experience in real estate valuation—specifically in developing models to determine house pricing—several factors were surprisingly influential. Two variables stood out:
First, the proximity to high-voltage power lines has a significant negative impact on property value. Observing homes in markets like Dallas revealed that the physical presence of these lines, along with the associated buzzing noise and parental concerns for child safety, creates a substantial deterrent for buyers.
Second, floor plans are critical to valuation but notoriously difficult to quantify. Distinguishing between a highly desirable floor plan and one that is poorly laid out remains one of the most complex challenges in automated property appraisal.
Challenges in Quantifying Home Features
Quantifying the quality of a floor plan proved to be a major hurdle. It involved assessing various dimensions and layouts, such as kitchen width, kitchen style, and the placement of the master bedroom. When a home struggled to sell, the operations team would often identify the floor plan as the culprit. Because the data was rarely digitized—often residing on physical paper records for homes in places like Phoenix and Dallas—the assessment often relied on the subjective experience of walking through the home and feeling the flow.
Curb Appeal and Investment Impact
The impact of curb appeal, particularly the front door, was another factor that proved more significant than initially anticipated. There is existing Zillow research suggesting that replacing a front door offers one of the highest returns on investment for a property. The psychological experience of a buyer as they walk up to a home and interact with those first moments of the property is a critical element that is often underrated in valuation models.
Connecting with Sherwin Wu
For those interested in following or reaching out, Sherwin Wu is active on X (formerly Twitter) under the handle @sherwinwu. He primarily shares updates regarding OpenAI, its API, and new product launches. He encourages those building startups or hacking on new ideas to reach out to him on the platform to share what they are working on and to discuss how OpenAI might be able to support their efforts.